Last modified: 2021-05-11 20:00:15

1 Introduction

1.1 Background

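In the context of the semantic web, structured data refers to data that is described according to defined rules and that carries reference relationships between data items. In other words, it is data whose machine readability has been increased by attaching metadata. Representative examples of structured data, or structured knowledge, are ontologies and Linked Open Data (LOD) such as Wikidata and DBpedia. Structured data is often described using the RDF data model.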

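There is a need to understand a target or domain of interest through structured knowledge. However, the target of interest is often described by unstructured data, that is, by a loosely bounded range of data given as text or as a list of vocabulary terms. As a result, there is a large gap when mapping between these two kinds of data.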

Unless one is an expert well versed in ontologies and LOD, it is difficult to define the target of interest clearly at an early stage and to map it to structured data.

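Therefore, to support building an initial model of structured data for a domain of interest, we built a tool-set that extracts the corresponding subset of structured data from a small vocabulary list of the target domain.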

Figure 1: Overview of the domain ontology construction

This tutorial presents, as a real-world example, the procedure for obtaining structured data from LOD.

1.2 agGraphSearch package

The agGraphSearch package is a tool-set that supports the construction of domain ontologies. It provides a method for extracting the concepts of a target domain from a large-scale public Linked Open Data (LOD) source. The proposed method obtains the class hierarchy related to the domain concepts from the occurrences of common upper-level entities and the chains of path relationships that lead to them. The method is outlined in Figure 2.

Figure 2: Overview of the upper-level concept graph and analysis algorithm
The numbers in the nodes indicate the number of search entities that exist in the subordinate concepts.

As an example of class hierarchy extraction from LOD, this short tutorial provides a workflow to obtain and visualize the conceptual hierarchies related to leukemia from a Wikidata endpoint, starting from a few entity labels.

The workflow of the proposed method is outlined in Figure 3.

Figure 3: Overview of the workflow of the proposed method

The resulting network is similar to the graph obtained with Wikidata Graph Builder.

1.3 Getting started

The following commands install agGraphSearch from GitHub if it is not yet available and then load it.

#install
if(!require("agGraphSearch")){
  install.packages( "devtools" )
  devtools::install_github( "kumeS/agGraphSearch" )
}

#load
library("agGraphSearch")

2 Workflow for searching the graph for leukemia.

2.2 SPARQL query (1) counting labels and class relations

Figure 4: Data model for the Wikidata class hierarchy

This tutorial focuses mainly on the data model for class hierarchies in Wikidata, shown in Figure 4. The class hierarchy of Wikidata is represented using the properties subClassOf (wdt:P279) and instanceOf (wdt:P31) as conceptual relationships between entities. Wikidata entities are identified by IDs called QIDs. In addition to QIDs, this tutorial uses the properties for the representative name (rdfs:label) and the alias (skos:altLabel), which link a QID to its label information.
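To make the data model concrete, the sketch below assembles a SPARQL query that lists the direct parent classes (wdt:P279) of one entity. The helper `build_parent_query()` and its `graph` argument are hypothetical and are not part of agGraphSearch; only the prefixes, the property, and the QID Q180664 are taken from this tutorial.

```r
# Hypothetical helper (not part of agGraphSearch): build a query that lists
# the direct parent classes (wdt:P279) of one Wikidata entity.
build_parent_query <- function(qid, graph = NULL) {
  # Accept only bare QIDs such as "Q180664"
  stopifnot(grepl("^Q[0-9]+$", qid))
  prefix <- paste(
    "PREFIX wd: <http://www.wikidata.org/entity/>",
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>",
    sep = "\n"
  )
  # Optional named graph; omitted when graph is NULL
  from <- if (is.null(graph)) "" else paste0("FROM <", graph, ">\n")
  paste0(
    prefix, "\n",
    "SELECT DISTINCT ?parentClass\n",
    from,
    "WHERE {\n",
    "  wd:", qid, " wdt:P279 ?parentClass .\n",
    "}"
  )
}

query <- build_parent_query("Q180664")
cat(query, "\n")
```

Replacing wdt:P279 with wdt:P31 in the triple pattern would give the corresponding instanceOf query.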

2.2.1 Check SPARQL query

ter00 <- terms[1]

#check Query
CkeckQuery_agCount_Label_Num_Wikidata_P279_P31(Entity_Name = ter00)
## EndPoint:
## http://kozaki-lab.osakac.ac.jp/agraph/NEDO_pj
## Prefix:
## PREFIX wd: <http://www.wikidata.org/entity/>
## PREFIX wdt: <http://www.wikidata.org/prop/direct/>
## PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
## PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
## PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
## PREFIX owl: <http://www.w3.org/2002/07/owl#>
## PREFIX dct: <http://purl.org/dc/terms/>
## PREFIX foaf: <http://xmlns.com/foaf/0.1/>
## PREFIX wikibase: <http://wikiba.se/ontology#>
## ```````````````````````````````````````````
## ### 001 ###
## ```````````````````````````````````````````
## SELECT (count(distinct ?subject) as ?Count_As_Label)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject rdfs:label "acute lymphocytic leukemia"@en. 
## }
## ```````````````````````````````````````````
## ### 002 ###
## ```````````````````````````````````````````
## SELECT (count(distinct ?subject) as ?Count_As_AltLabel)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject skos:altLabel "acute lymphocytic leukemia"@en. 
## }
## ```````````````````````````````````````````
## ### 003 ###
## ```````````````````````````````````````````
## SELECT  (count(distinct ?parentClass ) as ?Count_Of_ParentClass_Label)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject rdfs:label "acute lymphocytic leukemia"@en. 
## ?subject wdt:P279 ?parentClass.
## }
## ```````````````````````````````````````````
## SELECT  (count(distinct ?parentClass ) as ?Count_Of_ParentClass_altLabel)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject skos:altLabel "acute lymphocytic leukemia"@en. 
## ?subject wdt:P279 ?parentClass.
## }
## ```````````````````````````````````````````
## ### 004 ###
## ```````````````````````````````````````````
## SELECT  (count(distinct ?childClass ) as ?Count_Of_ChildClass_Label)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject rdfs:label "acute lymphocytic leukemia"@en. 
## ?childClass wdt:P279 ?subject.
## }
## ```````````````````````````````````````````
## SELECT  (count(distinct ?childClass ) as ?Count_Of_ChildClass_altLabel)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject skos:altLabel "acute lymphocytic leukemia"@en. 
## ?childClass wdt:P279 ?subject.
## }
## ```````````````````````````````````````````
## ### 005 ###
## ```````````````````````````````````````````
## SELECT  (count(distinct ?instance ) as ?Count_InstanceOf_Label)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject rdfs:label "acute lymphocytic leukemia"@en. 
## ?subject wdt:P31 ?instance.
## }
## ```````````````````````````````````````````
## SELECT  (count(distinct ?instance ) as ?Count_InstanceOf_altLabel)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject skos:altLabel "acute lymphocytic leukemia"@en. 
## ?subject wdt:P31 ?instance.
## }
## ```````````````````````````````````````````
## ### 006 ###
## ```````````````````````````````````````````
## SELECT  (count(distinct ?instance ) as ?Count_Has_Instance_Label)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject rdfs:label "acute lymphocytic leukemia"@en. 
## ?instance wdt:P31 ?subject.
## }
## ```````````````````````````````````````````
## SELECT  (count(distinct ?instance ) as ?Count_Has_Instance_altLabel)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?subject skos:altLabel "acute lymphocytic leukemia"@en. 
## ?instance wdt:P31 ?subject.
## }
## ```````````````````````````````````````````
#Endpoint
agGraphSearch::KzLabEndPoint_Wikidata$EndPoint
#Graph id
agGraphSearch::KzLabEndPoint_Wikidata$FROM

#run SPARQL
#library(SPARQL)
res <- agCount_Label_Num_Wikidata_P279_P31(Entity_Name = ter00)
res

#View table
#agTableDT(res, Width = "100px", Transpose = TRUE, AutoWidth=FALSE)

2.2.2 Counting labels and class relations with a for-loop

The loop below runs the same SPARQL count queries once for each search term. The input here is three terms.

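#create an empty list to collect one result per term
#(terms is the character vector of search labels used in the previous step)
m <- list()

#Run
for(n in seq_along(terms)){
#message(n)
m[[n]] <- agCount_Label_Num_Wikidata_P279_P31(Entity_Name = terms[n])
}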

#convert list to data.frame
fm <- ListDF2DF(m)

#View the data
#agTableDT(fm, Width = "100px", Transpose = TRUE, AutoWidth=FALSE)
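The collect-then-bind pattern used above can be illustrated in base R without an endpoint. The sketch below is only an illustration: `count_one()` is a hypothetical stand-in for a per-term query function, the input terms are invented, and `do.call(rbind, ...)` is a generic base R idiom rather than a description of how ListDF2DF is implemented.

```r
# Hypothetical stand-in for a per-term query function; it returns a
# one-row data.frame. The real agCount_* functions query an endpoint.
count_one <- function(term) {
  data.frame(LABEL = term, n_char = nchar(term), stringsAsFactors = FALSE)
}

toy_terms <- c("term a", "term bb", "term ccc")  # invented inputs

# One data.frame per term, collected in a list
res_list <- lapply(toy_terms, count_one)

# Row-bind the list into a single data.frame
res_df <- do.call(rbind, res_list)
res_df
```

Using `lapply()` avoids growing a list inside a loop and keeps one result per input term in the same order.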

2.2.3 Extract only results with label and upper-level class

fm1 <- fm[c(fm$Hit_Label > 0),]
fm2 <- fm1[c(fm1$Hit_ALL > 0),]

#dim(fm)
#dim(fm1)
#dim(fm2)
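As a minimal sketch of this filtering step, the example below applies both conditions to a small invented table. The column names `LABEL`, `Hit_Label`, and `Hit_ALL` follow the code above; the values are made up for illustration, and `toy_fm` is a hypothetical name.

```r
# Invented counts; only the column names follow the tutorial
toy_fm <- data.frame(
  LABEL     = c("term a", "term b", "term c"),
  Hit_Label = c(1, 0, 2),
  Hit_ALL   = c(3, 0, 0),
  stringsAsFactors = FALSE
)

# Keep rows that pass both conditions in one step
keep <- toy_fm$Hit_Label > 0 & toy_fm$Hit_ALL > 0
toy_fm2 <- toy_fm[keep, , drop = FALSE]
toy_fm2$LABEL
```

Combining the two conditions with `&` gives the same rows as filtering twice in sequence.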

2.2.4 Assigning Label information to QID

Lab01 <- fm2$LABEL

#Check Query
CkeckQuery_agWD_Alt_Wikidata(Lab01[1])
## EndPoint:
## http://kozaki-lab.osakac.ac.jp/agraph/NEDO_pj
## Prefix:
## PREFIX wd: <http://www.wikidata.org/entity/>
## PREFIX wdt: <http://www.wikidata.org/prop/direct/>
## PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
## PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
## PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
## PREFIX owl: <http://www.w3.org/2002/07/owl#>
## PREFIX dct: <http://purl.org/dc/terms/>
## PREFIX foaf: <http://xmlns.com/foaf/0.1/>
## PREFIX wikibase: <http://wikiba.se/ontology#>
## ```````````````````````````````````````````
## SELECT distinct ?subject  
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## optional{ ?subject rdfs:label "acute lymphocytic leukemia"@en. }
## optional{ ?subject skos:altLabel "acute lymphocytic leukemia"@en. }
## }
## ```````````````````````````````````````````

2.2.5 Retry SPARQL by QID

#View query
CkeckQuery_agCount_ID_Num_Wikidata_QID_P279_P31(QID[1])
## EndPoint:
## http://kozaki-lab.osakac.ac.jp/agraph/NEDO_pj
## Prefix:
## PREFIX wd: <http://www.wikidata.org/entity/>
## PREFIX wdt: <http://www.wikidata.org/prop/direct/>
## PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
## PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
## PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
## PREFIX owl: <http://www.w3.org/2002/07/owl#>
## PREFIX dct: <http://purl.org/dc/terms/>
## PREFIX foaf: <http://xmlns.com/foaf/0.1/>
## PREFIX wikibase: <http://wikiba.se/ontology#>
## ```````````````````````````````````````````
## SELECT  (count(distinct ?parentClass) as ?Count_Of_ParentClass)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## wd:Q180664 wdt:P279 ?parentClass.
## }
## ```````````````````````````````````````````
## SELECT  (count(distinct ?childClass) as ?Count_Of_ChildClass)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?childClass wdt:P279 wd:Q180664.
## }
## ```````````````````````````````````````````
## SELECT  (count(distinct ?instance) as ?Count_InstanceOf)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## wd:Q180664 wdt:P31 ?instance.
## }
## ```````````````````````````````````````````
## SELECT  (count(distinct ?instance) as ?Count_Has_Instance)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## ?instance wdt:P31 wd:Q180664.
## }
## ```````````````````````````````````````````
#create an empty variable
QID_res <- c()

#Try SPARQL with QID
#loop over the QIDs themselves, since the index is applied to QID
for(n in seq_along(QID)){
QID_res[[n]] <- agCount_ID_Num_Wikidata_QID_P279_P31(QID[n])
}

#convert list to data.frame
QID_res2 <- ListDF2DF(QID_res)

#check it
head(QID_res2)
dim(QID_res2)
colnames(QID_res2)

#All
table(QID_res2$Hit_All)
table(QID_res2$Hit_All > 0)
table(QID_res2$Hit_All_Parent > 0)
table(QID_res2$Hit_All_Child > 0)

#View the results
#agTableDT(QID_res2, Width = "100px", Transpose = TRUE, AutoWidth=FALSE)

2.3 SPARQL query (2) Excluding the particular relations

This step searches for neighboring entities and properties and counts whether each is present. If a particular entity or property exists among the neighbors, the search entity is excluded, as shown in Figure 5.

Examples of neighboring entities:
- family name (wd:Q101352)
- movie (wd:Q11424)

Examples of neighboring properties:
- sex or gender (wdt:P21)
- located in the administrative territorial entity (wdt:P131)

Figure 5: Exclusion of non-applicable entities by relationships with the adjacent entity and the property

#For neighboring entities
#Check query
CkeckQuery_agCount_ID_Prop_Obj_Wikidata_vP( Entity_ID=QID[1], Object="wd:Q101352" )
## EndPoint:
## http://kozaki-lab.osakac.ac.jp/agraph/NEDO_pj
## Prefix: 
## PREFIX wd: <http://www.wikidata.org/entity/>
## PREFIX wdt: <http://www.wikidata.org/prop/direct/>
## PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
## PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
## PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
## PREFIX owl: <http://www.w3.org/2002/07/owl#>
## PREFIX dct: <http://purl.org/dc/terms/>
## PREFIX foaf: <http://xmlns.com/foaf/0.1/>
## PREFIX wikibase: <http://wikiba.se/ontology#>
## ```````````````````````````````````````````
## SELECT (count(distinct ?p) as ?Count)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## wd:Q180664 ?p wd:Q101352.
## } 
## ```````````````````````````````````````````
#create an exclusion QID list without "wd:"
ExcluQ <- c("Q101352", "Q11424")
NumQ <- length(ExcluQ)
QIDdf <- data.frame(QID=QID)

#run SPARQL
for(m in seq_len(NumQ)){
#print(ExcluQ[m])

res <- c()
for(n in seq_len(length(QID))){
res[[n]] <- agCount_ID_Prop_Obj_Wikidata_vP(Entity_ID=QID[n], Object=paste0("wd:", ExcluQ[m]))
}
res1 <- ListDF2DF(res)
#add a logical column named after the excluded entity (TRUE if a triple exists)
QIDdf[[ExcluQ[m]]] <- as.numeric(unlist(res1)) > 0
}

#View the result
agTableKB(QIDdf)
QID            Q101352   Q11424
wd:Q180664     FALSE     FALSE
wd:Q5113976    FALSE     FALSE
wd:Q55790812   FALSE     FALSE

#For neighboring properties
#Check query
CkeckQuery_agCount_ID_Prop_Obj_Wikidata_vO( Entity_ID=QID[1], Property="wdt:P21")
## EndPoint:
## http://kozaki-lab.osakac.ac.jp/agraph/NEDO_pj
## Prefix: 
## PREFIX wd: <http://www.wikidata.org/entity/>
## PREFIX wdt: <http://www.wikidata.org/prop/direct/>
## PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
## PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
## PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
## PREFIX owl: <http://www.w3.org/2002/07/owl#>
## PREFIX dct: <http://purl.org/dc/terms/>
## PREFIX foaf: <http://xmlns.com/foaf/0.1/>
## PREFIX wikibase: <http://wikiba.se/ontology#>
## ```````````````````````````````````````````
## SELECT (count(distinct ?o) as ?Count)
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## wd:Q180664 wdt:P21 ?o.
## } 
## ```````````````````````````````````````````
#create an exclusion list without "wdt:"
ExcluP <- c("P21", "P131")
NumP <- length(ExcluP)

#run SPARQL
for(m in seq_len(NumP)){
print(ExcluP[m])

res <- c()
for(n in seq_len(length(QID))){
res[[n]] <- agCount_ID_Prop_Obj_Wikidata_vO(Entity_ID=QID[n], Property=paste0("wdt:", ExcluP[m]))
}
res1 <- ListDF2DF(res)
#add a logical column named after the excluded property (TRUE if a triple exists)
QIDdf[[ExcluP[m]]] <- as.numeric(unlist(res1)) > 0
}

#view the result
agTableKB(QIDdf)
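The flag table built above records only whether each excluded entity or property is present. One possible way to apply the exclusion is sketched below on an invented table with the same column layout; the row IDs are placeholders and the names `toy_flags`, `flag_cols`, and `kept_qid` are hypothetical.

```r
# Invented flag table with the same layout as QIDdf (placeholder IDs)
toy_flags <- data.frame(
  QID     = c("wd:Q_example1", "wd:Q_example2", "wd:Q_example3"),
  Q101352 = c(FALSE, TRUE, FALSE),
  Q11424  = c(FALSE, FALSE, FALSE),
  P21     = c(FALSE, FALSE, TRUE),
  P131    = c(FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Every column except the ID is an exclusion flag
flag_cols <- setdiff(colnames(toy_flags), "QID")

# An entity is excluded when at least one flag is TRUE
excluded <- rowSums(toy_flags[, flag_cols, drop = FALSE]) > 0
kept_qid <- toy_flags$QID[!excluded]
kept_qid
```

In the table shown earlier in this section all entity flags are FALSE, so that table alone would not exclude any of the three QIDs.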

2.4 SPARQL query (3) Examining the upper-level class relations

# instanceOf (wdt:P31)
# n still holds the last index from the loop above, so this previews the query for the last QID
CkeckQuery_agWD_ID_Prop_Obj_Wikidata_vO(Entity_ID=QID[n], Property="wdt:P31")
## EndPoint:
## http://kozaki-lab.osakac.ac.jp/agraph/NEDO_pj
## Prefix: 
## PREFIX wd: <http://www.wikidata.org/entity/>
## PREFIX wdt: <http://www.wikidata.org/prop/direct/>
## PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
## PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
## PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
## PREFIX owl: <http://www.w3.org/2002/07/owl#>
## PREFIX dct: <http://purl.org/dc/terms/>
## PREFIX foaf: <http://xmlns.com/foaf/0.1/>
## PREFIX wikibase: <http://wikiba.se/ontology#>
## ```````````````````````````````````````````
## SELECT distinct ?o ?oLabelj ?oLabele 
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## wd:Q55790812 wdt:P31 ?o .
## ?o rdfs:label ?oLabelj . filter(LANG(?oLabelj) = "ja").
## ?o rdfs:label ?oLabele . filter(LANG(?oLabele) = "en").
## }
## ```````````````````````````````````````````
#create an empty variable
res3 <- c()

#run SPARQL
for(n in seq_len(length(QID))){
res3[[n]] <- agWD_ID_Prop_Obj_Wikidata_vO(Entity_ID=QID[n], Property="wdt:P31")
}

# subClassOf (wdt:P279)
# n still holds the last index from the loop above, so this previews the query for the last QID
CkeckQuery_agWD_ID_Prop_Obj_Wikidata_vO(Entity_ID=QID[n], Property="wdt:P279")
## EndPoint:
## http://kozaki-lab.osakac.ac.jp/agraph/NEDO_pj
## Prefix: 
## PREFIX wd: <http://www.wikidata.org/entity/>
## PREFIX wdt: <http://www.wikidata.org/prop/direct/>
## PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
## PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
## PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
## PREFIX owl: <http://www.w3.org/2002/07/owl#>
## PREFIX dct: <http://purl.org/dc/terms/>
## PREFIX foaf: <http://xmlns.com/foaf/0.1/>
## PREFIX wikibase: <http://wikiba.se/ontology#>
## ```````````````````````````````````````````
## SELECT distinct ?o ?oLabelj ?oLabele 
## From <http://wikidata_nearly_full_201127> 
## WHERE {
## wd:Q55790812 wdt:P279 ?o .
## ?o rdfs:label ?oLabelj . filter(LANG(?oLabelj) = "ja").
## ?o rdfs:label ?oLabele . filter(LANG(?oLabele) = "en").
## }
## ```````````````````````````````````````````
#create an empty variable
res4 <- c()

#run SPARQL
for(n in seq_len(length(QID))){
res4[[n]] <- agWD_ID_Prop_Obj_Wikidata_vO(Entity_ID=QID[n], Property="wdt:P279")
}

#convert list to data.frame
res3b <- ListDF2DF(res3)
res4b <- ListDF2DF(res4)
res <- rbind(res3b, res4b)

#remove rows with NA on "o" col
res.na <- res[!is.na(res$o),]

#View the result
#agTableDT(res.na, Width = "100px", Transpose = FALSE, AutoWidth=FALSE)

2.5 SPARQL query (4) Searching for the upper-level concepts

2.5.1 Obtaining the upper-level concepts from the input terms

#create a new folder
if(!dir.exists("02_Short_Out")){dir.create("02_Short_Out")}

#create an empty variable
res5 <- c()

#run SPARQL; search the upper-level classes
for(n in 1:length(QID)){
  message(n)
  res5[[n]] <- PropertyPath_GraphUp_Wikidata(Entity_ID = QID[n], Depth = 30)  
}

#check results
head(res5[[1]])
agTableDT(res5[[1]])

#Count rows
checkNrow_af(res5)

#Detect loop
checkLoop_af(res5)

#Save
saveRDS(res5,
        file="./02_Short_Out/Individual_upGraph.Rdata",
        compress = TRUE)

Alternatively, the same search can be run with purrr::map:

#run SPARQL with purrr::map function
res5m <- purrr::map(QID, PropertyPath_GraphUp_Wikidata, Depth = 30)

#check results
#Count rows
checkNrow_af(res5m)

#Detect loop
checkLoop_af(res5m)
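To show why a depth limit and a loop check matter when walking up a class hierarchy, the sketch below performs a bounded upward walk over an invented child-to-parent edge list that contains a cycle. It is only an illustration in base R: `walk_up()` and `toy_edges` are hypothetical, and the code makes no claim about how PropertyPath_GraphUp_Wikidata or checkLoop_af work internally. Only the column names `subject` and `parentClass` and the idea of a `Depth` bound come from this tutorial.

```r
# Invented child -> parent edge list; A -> B -> C -> A forms a loop
toy_edges <- data.frame(
  subject     = c("A", "B", "C", "D"),
  parentClass = c("B", "C", "A", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collect all edges reachable upward from `start`,
# visiting each node at most once and stopping after `depth` steps.
walk_up <- function(edges, start, depth = 30) {
  visited  <- character(0)
  frontier <- start
  found    <- edges[0, ]
  for (d in seq_len(depth)) {
    step <- edges[edges$subject %in% frontier, , drop = FALSE]
    if (nrow(step) == 0) break
    found    <- unique(rbind(found, step))
    visited  <- union(visited, frontier)
    # Only expand parents that have not been expanded before
    frontier <- setdiff(step$parentClass, visited)
    if (length(frontier) == 0) break
  }
  found
}

up_D <- walk_up(toy_edges, "D")
up_D
```

The visited set makes the walk terminate even though the toy graph contains a cycle, and the depth bound caps the number of steps independently of the graph's shape.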

2.5.2 Individual network diagrams

#create a new folder
if(!dir.exists("Short_Out_vis")){dir.create("Short_Out_vis")}

#create networks
for(n in 1:length(res5)){
#n <- 1
a <- agIDtoLabel_Wikidata(Entity_ID = QID[n])
if(is.na(a[,2])){a[,2] <- a[,3]}

Lab00 <- paste(a[,c(2, 1)], collapse = ".")
FileName <- paste0("agVisNetwork_", Lab00,"_", format(Sys.time(), "%y%m%d"),".html")

#run the network creation
agVisNetwork(Graph=res5[[n]], 
             Selected=Lab00, 
             Browse=FALSE, 
             Output=TRUE,
             FilePath=FileName)
Sys.sleep(1)

filesstrings::file.move(files=FileName,
                        destinations="./Short_Out_vis",
                        overwrite = TRUE)

#remove the leftover "_files" folder, if any
FilesDir <- paste0("./agVisNetwork_", formatC(n, flag="0", width=4), "_", Lab00, "_files")
if(dir.exists(FilesDir)){
  unlink(FilesDir, recursive = TRUE)
}}

#View the results
#browseURL(paste0("./Short_Out_vis/", dir("Short_Out_vis", pattern=".html")[1]))
#browseURL(paste0("./Short_Out_vis/", dir("Short_Out_vis", pattern=".html")[2]))
#browseURL(paste0("./Short_Out_vis/", dir("Short_Out_vis", pattern=".html")[3]))

2.5.3 Merged network diagrams

#Merge them to one dataset
res6 <- ListDF2DF(res5)

#check NAs
table(is.na(res6))

#Delete duplicates
res6d <- Exclude_Graph_duplicates(input=res6)

#check dim
dim(res6); dim(res6d)

#Save
saveRDS(res6d,
        file="./02_Short_Out/Merged_upGraph.Rdata",
        compress = TRUE)

#run the network creation
FileName <- paste0("agVisNetwork_Merged", "_", 
                   format(Sys.time(), "%y%m%d"),".html")

agVisNetwork(Graph=res6d,
             Browse=FALSE,
             Output=TRUE,
             FilePath=FileName)
filesstrings::file.move(files=FileName,
                        destinations="./Short_Out_vis",
                        overwrite = TRUE)

#View the results
#browseURL(paste0("./Short_Out_vis/", FileName))

Figure 6: Merged network diagrams for search terms related to leukemia

2.5.4 Identification of the common upper-level entities using individual networks

The common upper-level concept is defined based on the edge list of triples obtained above.
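The counting idea can be sketched on invented data before running it on the real graphs. In the example below each toy graph stands for the upper-level edge list of one search entity; all IDs are made up, and the cutoff of two is chosen only for illustration and is not meant to reproduce the behavior of Cutoff_FreqNum.

```r
# Three invented upper-level graphs, one per search entity
toy_graphs <- list(
  data.frame(subject = c("e1", "p1"), parentClass = c("p1", "root"),
             stringsAsFactors = FALSE),
  data.frame(subject = c("e2", "p1"), parentClass = c("p1", "root"),
             stringsAsFactors = FALSE),
  data.frame(subject = c("e3", "p2"), parentClass = c("p2", "root"),
             stringsAsFactors = FALSE)
)

# Count each parent once per graph, then tabulate across graphs
parents <- unlist(lapply(toy_graphs, function(g) unique(g$parentClass)))
freq <- table(parents)
freq_df <- data.frame(parentClass = names(freq),
                      Freq = as.numeric(freq),
                      stringsAsFactors = FALSE)

# Keep parents shared by at least two search entities (illustrative cutoff)
common <- freq_df[freq_df$Freq >= 2, ]
common[order(-common$Freq), ]
```

Here the frequency of a parent equals the number of search entities that have it somewhere above them, which matches the node counts described for Figure 2.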

##Graph data without the duplicates
#Number of entities
(E01 <- length(unique(c(res6d$subject, res6d$parentClass))))
#Number of labels
(E02 <- length(unique(c(res6d$subjectLabel, res6d$parentClassLabel))))
#Number of Triples
(E03 <- length(unique(res6d$triples)))

#Gathering the parent concepts
upEntity <- unlist(purrr::map(res5, function(x){unique(x$parentClass)}))

#convert it to data frame
Count_upEntity <- table(upEntity)
Count_upEntity_DF <- data.frame(parentClass=names(Count_upEntity),
                                Freq=as.numeric(Count_upEntity), 
                                row.names = 1:length(Count_upEntity),
                                stringsAsFactors = F)

#Count and view table
agTableDT(Count_upEntity_DF, Transpose = F, AutoWidth = FALSE)

#Count Freq
table(Count_upEntity_DF$Freq)

#extract parentClass & parentClassLabel from the merged dataset
Dat <- data.frame(res6d[,c(colnames(res6d) == "parentClass" | 
                          colnames(res6d) == "parentClassLabel")], 
                  stringsAsFactors = F)
head(Dat)

#Delete the duplicates
Dat0 <- Exclude_duplicates(Dat, 1)
head(Dat0)

#define the common upper-level entities
dim(Count_upEntity_DF); dim(Dat0)
Count_upEntity_DF2 <- Cutoff_FreqNum(input1=Count_upEntity_DF, 
                                     input2=Dat0, 
                                     By="parentClass", 
                                     Sort="Freq", 
                                     FreqNum=1)

#check the results
head(Count_upEntity_DF2, n=10)
table(Count_upEntity_DF2$Freq)

#save
saveRDS(Count_upEntity_DF2,
        file = "./02_Short_Out/Count_upEntity_DF2.Rdata", compress = TRUE)
readr::write_excel_csv(Count_upEntity_DF2,
                       file=paste("./02_Short_Out/Count_upEntity_DF2.csv", sep=""))

#Calculation of inclusion rate
QID <- QIDdf$QID

##QID
qid <- unique(c(res6d$subject, res6d$parentClass))
b <- setdiff(QID, qid)
b; length(b)

##rdfsLabel
#RdfsLabel <- unique(c(res6d$subjectLabel, res6d$parentClassLabel))

2.5.5 Results for the common upper-level entities

FileName <- paste0("./FrequencyGraph_", format(Sys.time(), "%y%m%d_%H%M"),".html")

pc_plot(input=Count_upEntity_DF2, 
        SaveFolder="Short_Out_vis", 
        FileName=FileName, 
        IDnum=3)

#View the results
#browseURL(paste0("./Short_Out_vis/", dir("Short_Out_vis", pattern="FrequencyGraph_")[2]))
#browseURL(paste0("./Short_Out_vis/", dir("Short_Out_vis", pattern="FrequencyGraph_")[1]))

Session information

## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] ja_JP.UTF-8/ja_JP.UTF-8/ja_JP.UTF-8/C/ja_JP.UTF-8/ja_JP.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] agGraphSearch_0.99.1 SPARQL_1.16          RCurl_1.98-1.3      
## [4] XML_3.99-0.6         EBImage_4.32.0       BiocStyle_2.18.1    
## 
## loaded via a namespace (and not attached):
##  [1] locfit_1.5-9.4      lattice_0.20-41     tidyr_1.1.3        
##  [4] visNetwork_2.0.9    fftwtools_0.9-11    png_0.1-7          
##  [7] assertthat_0.2.1    digest_0.6.27       utf8_1.2.1         
## [10] R6_2.5.0            tiff_0.1-8          filesstrings_3.2.2 
## [13] evaluate_0.14       httr_1.4.2          ggplot2_3.3.3      
## [16] highr_0.9           pillar_1.6.0        rlang_0.4.10       
## [19] lazyeval_0.2.2      data.table_1.14.0   jquerylib_0.1.4    
## [22] DT_0.18             rmarkdown_2.7       readr_1.4.0        
## [25] stringr_1.4.0       htmlwidgets_1.5.3   franc_1.1.3        
## [28] igraph_1.2.6        munsell_0.5.0       compiler_4.0.2     
## [31] xfun_0.22           pkgconfig_2.0.3     BiocGenerics_0.36.1
## [34] htmltools_0.5.1.1   tidyselect_1.1.0    tibble_3.1.1       
## [37] bookdown_0.22       viridisLite_0.4.0   fansi_0.4.2        
## [40] crayon_1.4.1        dplyr_1.0.5         bitops_1.0-7       
## [43] grid_4.0.2          jsonlite_1.7.2      formattable_0.2.1  
## [46] gtable_0.3.0        lifecycle_1.0.0     DBI_1.1.1          
## [49] magrittr_2.0.1      scales_1.1.1        stringi_1.5.3      
## [52] bslib_0.2.4         ellipsis_0.3.2      vctrs_0.3.8        
## [55] generics_0.1.0      tools_4.0.2         glue_1.4.2         
## [58] purrr_0.3.4         hms_1.0.0           jpeg_0.1-8.1       
## [61] networkD3_0.4       abind_1.4-5         parallel_4.0.2     
## [64] yaml_2.2.1          colorspace_2.0-0    BiocManager_1.30.12
## [67] strex_1.4.2         plotly_4.9.3        knitr_1.33         
## [70] sass_0.3.1